TAP-MYSQL: Support full GTID set tracking for multi-source MySQL repl…#1271

Open
DrashtiChhatralia wants to merge 1 commit into transferwise:master from DrashtiChhatralia:1268_GTID_set_issue

@DrashtiChhatralia DrashtiChhatralia commented Apr 16, 2026

Context

The current use_gtid implementation for MySQL only tracks a single server UUID from @@GLOBAL.gtid_executed.

In multi-source MySQL setups or after server migrations, the GTID set spans multiple UUIDs (e.g. uuid1:1-5291,uuid2:1-81). The existing behavior:

  • Drops non-matching UUIDs from the state
  • Uses only one UUID for auto_position on resume, so MySQL replays events from the other UUIDs, which leads to duplicated or reprocessed data

This PR updates the implementation to support full GTID set tracking, ensuring correct resume behavior and eliminating duplicate data issues.

MariaDB and non-GTID MySQL pipelines remain unaffected.
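
To make the problem concrete, here is a minimal Python sketch of what filtering a GTID set down to one UUID loses. It is an illustration only, not the code in this PR: `parse_gtid_set` and `keep_single_source` are hypothetical helper names, and the parser assumes the classic untagged `source:interval[:interval...]` text format, using the placeholder identifiers from the example above.

```python
from __future__ import annotations


def parse_gtid_set(gtid_set: str) -> dict[str, list[tuple[int, int]]]:
    """Parse 'src:1-5:7,other:1-3' into {source: [(start, end), ...]} (inclusive ends)."""
    parsed: dict[str, list[tuple[int, int]]] = {}
    # gtid_executed can be wrapped across lines, so drop all whitespace first.
    compact = "".join(gtid_set.split())
    for entry in filter(None, compact.split(",")):
        # Split on ':' only, so hyphens inside a real UUID are left alone.
        source, *ranges = entry.split(":")
        intervals = parsed.setdefault(source, [])
        for rng in ranges:
            start, _, end = rng.partition("-")
            intervals.append((int(start), int(end or start)))
    return parsed


def keep_single_source(gtid_set: str, source: str) -> dict[str, list[tuple[int, int]]]:
    """Mimic the single-UUID behavior described above: keep one source, drop the rest."""
    parsed = parse_gtid_set(gtid_set)
    return {source: parsed[source]} if source in parsed else {}


full = parse_gtid_set("uuid1:1-5291,uuid2:1-81")
lossy = keep_single_source("uuid1:1-5291,uuid2:1-81", "uuid1")
# Sources present in the full set but missing from the filtered state.
dropped = sorted(set(full) - set(lossy))
```

Here `dropped` contains `uuid2`: on resume, a state built from `lossy` tells the server nothing about the transactions already consumed from that source, which is the replay scenario the description refers to.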

Types of changes

What types of changes does your code introduce to PipelineWise?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue) [optional]
  • New feature (non-breaking change which adds functionality) [optional]
  • Documentation Update (if none of the other choices apply) [optional]

Checklist

  • I have read the CONTRIBUTING doc
  • Description above provides context of the change
  • I have added tests that prove my fix is effective or that my feature works
  • Unit tests for changes (not needed for documentation changes)
  • CI checks pass with my changes
  • Relevant documentation is updated including usage instructions

@DrashtiChhatralia DrashtiChhatralia requested a review from a team as a code owner April 16, 2026 13:28
Copilot AI review requested due to automatic review settings April 16, 2026 13:28
@platon-github-app-production

Comment /request-review to automatically request reviews from the following teams:

You can also request review from a specific team by commenting /request-review team-name, or you can add a description with --notes "<message>"

💡 If you see something that doesn't look right, check the configuration guide.

Contributor

Copilot AI left a comment

Pull request overview

Changes: Bugfix (1), Test improvement (1), Documentation update (1)

This PR updates tap-mysql’s MySQL GTID handling to persist and resume from the full GTID set (multi-UUID), which prevents reprocessing/duplicate events in multi-source or post-migration topologies.

Changes:

  • Return/store the full MySQL @@GLOBAL.gtid_executed set (whitespace-normalized) rather than filtering to a single UUID.
  • Accumulate MySQL GTID state as a growing GTID set during binlog consumption, and normalize legacy/contaminated state before resuming.
  • Extend unit + integration coverage and update README documentation for GTID state behavior.
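
The accumulation step in the list above can be sketched roughly as follows. This is a hedged illustration rather than the implementation in binlog.py: `normalize`, `add_gtid`, and `_merge` are hypothetical names, the input is assumed to be the untagged `source:interval[:interval...]` text format, and each incoming GTID is assumed to arrive as `source:number`.

```python
from __future__ import annotations


def normalize(gtid_set: str) -> str:
    """Strip all whitespace (including newlines) from a GTID set string."""
    return "".join(gtid_set.split())


def _merge(intervals: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping or adjacent inclusive intervals."""
    merged: list[tuple[int, int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def add_gtid(gtid_set: str, gtid: str) -> str:
    """Return gtid_set extended with a single 'source:number' GTID."""
    sources: dict[str, list[tuple[int, int]]] = {}
    for entry in filter(None, normalize(gtid_set).split(",")):
        source, *ranges = entry.split(":")
        for rng in ranges:
            start, _, end = rng.partition("-")
            sources.setdefault(source, []).append((int(start), int(end or start)))

    source, number = normalize(gtid).split(":")
    sources.setdefault(source, []).append((int(number), int(number)))

    parts = []
    for src in sorted(sources):
        ranges = ":".join(
            f"{s}-{e}" if e > s else str(s) for s, e in _merge(sources[src])
        )
        parts.append(f"{src}:{ranges}")
    return ",".join(parts)


# A newly seen transaction extends the matching source and leaves the others intact.
state = add_gtid("uuid1:1-5291,\n uuid2:1-81", "uuid2:82")
```

Merging adjacent intervals keeps the stored state compact as events are consumed; a gap (for example `uuid1:1-3` plus `uuid1:7`) is kept as two separate intervals rather than being filled in.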

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| singer-connectors/tap-mysql/tap_mysql/sync_strategies/binlog.py | Implements full GTID set tracking/normalization and GTID-set merging during MySQL binlog sync. |
| singer-connectors/tap-mysql/tests/unit/sync_strategies/test_binlog.py | Adds/updates unit tests for full-set GTID fetch, normalization, merging, and pre-bookmark normalization. |
| singer-connectors/tap-mysql/tests/integration/test_tap_mysql.py | Updates GTID integration assertions for MySQL’s shared GTID set behavior across streams. |
| singer-connectors/tap-mysql/README.md | Documents MySQL full GTID set state semantics and separates MySQL vs MariaDB GTID state examples. |

Comments suppressed due to low confidence (1)

singer-connectors/tap-mysql/README.md:360

  • The MariaDB GTID format described above (domain-serverid-sequence with hyphens) doesn’t match the example values below, which use colon-separated 0:...:.... Please update the example to use the correct MariaDB GTID format (e.g. 0-<server_id>-<sequence>), consistent with the tap’s state and unit tests.
{
  "bookmarks": {
    "example_db-table1": {"log_file": "mysql-binlog.0003", "log_pos": 3244, "gtid": "0:364864374:599"},
    "example_db-table2": {"log_file": "mysql-binlog.0001", "log_pos": 42, "gtid": "0:364864374:375"},
    "example_db-table3": {"log_file": "mysql-binlog.0003", "log_pos": 100, "gtid": "0:364864374:399"}
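
The mismatch flagged in this comment can be checked mechanically. The sketch below is an assumption-laden illustration, not code from the tap: `is_mariadb_gtid` is a hypothetical helper that only validates a single `domain-server_id-sequence` GTID made of three hyphen-separated integers, and it does not handle comma-separated lists covering several domains.

```python
import re

# One MariaDB GTID: three unsigned integers joined by hyphens.
MARIADB_GTID = re.compile(r"\d+-\d+-\d+")


def is_mariadb_gtid(value: str) -> bool:
    """Return True if value looks like a single domain-server_id-sequence GTID."""
    return MARIADB_GTID.fullmatch(value) is not None


colon_form = is_mariadb_gtid("0:364864374:599")   # value as written in the README example
hyphen_form = is_mariadb_gtid("0-364864374-599")  # same numbers in the hyphenated format
```

With this check, the colon-separated value from the README example is rejected and the hyphenated rewrite of the same numbers is accepted.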

Comment thread singer-connectors/tap-mysql/tap_mysql/sync_strategies/binlog.py Outdated
Comment thread singer-connectors/tap-mysql/tap_mysql/sync_strategies/binlog.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


Comment thread singer-connectors/tap-mysql/tap_mysql/sync_strategies/binlog.py
Comment thread singer-connectors/tap-mysql/tap_mysql/sync_strategies/binlog.py
@DrashtiChhatralia
Author

Hello team, I’ve submitted this PR for review. Could a maintainer please review it?

@DrashtiChhatralia
Author

DrashtiChhatralia commented Apr 28, 2026

/request-review Could a maintainer please review it?

@platon-github-app-production

Success 🎉 The review request was sent to the following teams:

If you see something that doesn't look right, follow this doc to improve our slack channel mapping. Thank you!
